Multivariate Dyadic Regression Trees for Sparse Learning Problems
نویسندگان
چکیده
We propose a new nonparametric learning method based on multivariate dyadic regression trees (MDRTs). Unlike traditional dyadic decision trees (DDTs) or classification and regression trees (CARTs), MDRTs are constructed using penalized empirical risk minimization with a novel sparsity-inducing penalty. Theoretically, we show that MDRTs can simultaneously adapt to the unknown sparsity and smoothness of the true regression functions, and achieve the nearly optimal rates of convergence (in a minimax sense) for the class of (α,C)-smooth functions. Empirically, MDRTs can simultaneously conduct function estimation and variable selection in high dimensions. To make MDRTs applicable for large-scale learning problems, we propose a greedy heuristics. The superior performance of MDRTs are demonstrated on both synthetic and real datasets.
منابع مشابه
CORT: classification or regression trees
In this paper we challenge three of the underlying principles of CART, a well know approach to the construction of classification and regression trees. Our primary concern is with the penalization strategy employed to prune back an initial, overgrown tree. We reason, based on both intuitive and theoretical arguments, that the pruning rule for classification should be different from that used fo...
متن کاملMammalian Eye Gene Expression Using Support Vector Regression to Evaluate a Strategy for Detecting Human Eye Disease
Background and purpose: Machine learning is a class of modern and strong tools that can solve many important problems that nowadays humans may be faced with. Support vector regression (SVR) is a way to build a regression model which is an incredible member of the machine learning family. SVR has been proven to be an effective tool in real-value function estimation. As a supervised-learning appr...
متن کاملLearning with Mixtures of Trees
One of the challenges of density estimation as it is used in machine learning is that usually the data are multivariate and often the dimensionality is large. Operating with joint distributions over multidimensional domains raises specific problems that are not encountered in the univariate case. Graphical models are representations of joint densities that are specifically tailored to address t...
متن کاملLearning Linear Dependency Trees from Multivariate Time-series Data
Representing interactions between variables in large data sets in an understandable way is usually important and hard task. This article presents a methodology how a linear dependency structure between variables can be constructed from multivariate data. The dependencies between the variables are specified by multiple linear regression models. A sparse regression algorithm and bootstrap based r...
متن کاملDeblocking Joint Photographic Experts Group Compressed Images via Self-learning Sparse Representation
JPEG is one of the most widely used image compression method, but it causes annoying blocking artifacts at low bit-rates. Sparse representation is an efficient technique which can solve many inverse problems in image processing applications such as denoising and deblocking. In this paper, a post-processing method is proposed for reducing JPEG blocking effects via sparse representation. In this ...
متن کامل